An advanced computational intelligent framework to predict shear sonic velocity with application to mechanical rock classification

Shear sonic wave velocity (Vs) has a wide range of applications, from reservoir management and development to geomechanical and geophysical studies. In the current study, two approaches were adopted to predict Vs from several petrophysical well logs, including gamma ray (GR), density (RHOB), neutron (NPHI), and compressional sonic wave velocity (Vp). For this purpose, five intelligent models were implemented: random forest (RF), extra tree (ET), Gaussian process regression (GPR), and the adaptive neuro fuzzy inference system (ANFIS) integrated with differential evolution (DE) and imperialist competitive algorithm (ICA) optimizers. In the first approach, the target was estimated from Vp alone; in the second, Vs was predicted from the combination of Vp, GR, RHOB, and NPHI. In each scenario, 8061 data points from an oilfield in the southwest of Iran were investigated. The ET model showed a lower average absolute percent relative error (AAPRE) than the other models in both approaches. In the first approach, with Vp as the only input, the AAPRE values for the RF, ET, GPR, ANFIS + DE, and ANFIS + ICA models are 1.54%, 1.34%, 1.54%, 1.56%, and 1.57%, respectively. In the second scenario, the corresponding AAPRE values are 1.25%, 1.03%, 1.16%, 1.63%, and 1.49%, respectively. The Williams plot confirmed the validity of both the one-input and four-input ET models. For the one-input ET model, the Williams plot showed that all 8061 data points are valid. For the four-input ET model, the leverage approach identified only 240 "out of leverage" data points and only 169 suspected data points. In addition, the sensitivity analysis showed that Vp has a greater effect on the target parameter (Vs) than the other inputs.
Overall, the second scenario gave more satisfactory Vs predictions because its models achieved lower errors. Finally, the two ET models, together with a linear regression model of the kind that is of high interest to the industry, were applied to diagnose candidate layers along the formation for hydraulic fracturing. While the linear regression model fails to accurately trace variations of rock properties, the intelligent models successfully detect brittle intervals consistent with field measurements.


Data collection and processing
The candidate formation for this study is Sarvak carbonates of an oilfield in the southwest of Iran. The Sarvak formation, mainly composed of limestones, serves as a major oil-producing reservoir in this region. A variety of sedimentary features has been distinguished in Sarvak 35 , with the secondary porosity evaluated to range from 0 to 10% in the study area, implying a high degree of heterogeneity. The formation has an approximate thickness of 600 m, which is divided into upper and lower Sarvak layers separated by the 34-m-thick Ahmadi member. More than ten boreholes have been drilled to develop this reservoir, but only one well has full-waveform measurements registered by the Schlumberger DSI tool. Besides, conventional well logs, such as NPHI, RHOB, and GR, are available. The data set includes a total of 4048 data points, regularly recorded at depth intervals of 15.24 cm. Well logs are depth matched and then subjected to environmental and hole size corrections.
The lack of shear velocity measurements has posed significant challenges for geomechanical studies in this area and motivated us to develop a robust predictive model. The first step, the selection of input variables, is of paramount significance. To this end, we seek physically sound relationships between shear velocity (Vs) as the output and other logging data as the inputs. Sonic velocities in carbonate rocks were found to depend primarily on mineralogy and, more importantly, on the amount and type of porosity 36,37. In formation evaluation, a combination of Vp, GR, RHOB, and NPHI is frequently used for a detailed assessment of mineral contents and rock porosity. We establish two sets of predictive models: the first uses only Vp as the input parameter, and the second adopts the four well logs as the model variables. From now on, they are referred to as the one-input and four-input models, respectively. The reason for developing the former group is to find out how reliably these simple and widely used models bridge directly between compressional and shear velocities.
Scientific Reports | (2022) 12:5579 | https://doi.org/10.1038/s41598-022-08864-z | www.nature.com/scientificreports/

Model development and performance assessment
Modeling approaches. Gaussian process regression (GPR). The Gaussian process method was first proposed for prediction purposes in the late 1940s and found its way into machine learning in the mid-1990s 38. Since then, numerous computer simulation tests have confirmed the high efficiency of the Gaussian process (GP) method. An important advantage of Gaussian process regression (GPR) is its ability to handle high-dimensional, small-sample, and non-linear problems 39. Formally, a GP is a collection of random variables, any finite number of which have a joint Gaussian distribution. A GP is specified by a mean function and a positive-definite covariance (kernel) function 40. Consider a set of training data $D = \{(x_i, y_i)\}_{i=1}^{n}$ with $x_i \in \mathbb{R}^d$ and $y_i \in \mathbb{R}$. The mean function is determined through:

$$m(x) = E[f(x)]$$

The covariance function is given by:

$$k(x, x') = E\left[(f(x) - m(x))(f(x') - m(x'))\right]$$

in which $x, x' \in \mathbb{R}^d$. The task is to estimate $f(x_*)$ for the testing data $x_*$, and the GP can be written as:

$$f(x) \sim GP\left(m(x), k(x, x')\right)$$

For a regression problem, the model is defined as below 41:

$$y = f(x) + \xi$$

With $\xi \sim N(0, \sigma_y^2)$, the prior distribution of the observed values $y$ is:

$$y \sim N\left(0, K(X, X) + \sigma_y^2 I_n\right)$$

The joint prior distribution of the observed values $y$ and the estimate $f(x_*)$ is 41:

$$\begin{bmatrix} y \\ f(x_*) \end{bmatrix} \sim N\left(0, \begin{bmatrix} K(X, X) + \sigma_y^2 I_n & K(X, x_*) \\ K(x_*, X) & K(x_*, x_*) \end{bmatrix}\right)$$

Here, $K(X, X) = K_n = (K_{ij})$ is the $n \times n$ positive-definite covariance matrix whose element $K_{ij} = k(x_i, x_j)$ measures the correlation between $x_i$ and $x_j$. $K(X, x_*) = K(x_*, X)^T$ is the $n \times 1$ covariance matrix between the testing data $x_*$ and the training samples $X$. $K(x_*, x_*)$ is the covariance of the test data, and $I_n$ is the $n$-dimensional identity matrix 41.
Accordingly, the posterior distribution of the estimated value $f(x_*)$ is obtained as below 41:

$$f(x_*) \mid X, y, x_* \sim N(\mu_*, \Sigma_*)$$

where:

$$\mu_* = K(x_*, X)\left[K(X, X) + \sigma_y^2 I_n\right]^{-1} y$$

$$\Sigma_* = K(x_*, x_*) - K(x_*, X)\left[K(X, X) + \sigma_y^2 I_n\right]^{-1} K(X, x_*)$$

and $\mu_*$ and $\Sigma_*$ are the mean and covariance of $f(x_*)$.

Kernel function.
The kernel (covariance) function plays a key role in the Gaussian process because it controls the accuracy of GPR. The kernel function employed in the current study is the automatic relevance determination (ARD) exponential kernel.
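To make the GPR prediction step concrete, the following minimal Python sketch evaluates a posterior mean and variance with an ARD exponential kernel. It assumes the common ARD exponential form $k(x, x') = \sigma_f^2 \exp(-r)$ with $r = \sqrt{\sum_m (x_m - x'_m)^2 / \ell_m^2}$, a zero prior mean, and hand-picked hyperparameters; the function names, the toy data, and the small linear solver are illustrative assumptions, not the implementation used in the study.

```python
import math

def ard_exponential(x, z, length_scales, sigma_f=1.0):
    """ARD exponential kernel: sigma_f^2 * exp(-r), r = per-feature scaled distance."""
    r = math.sqrt(sum(((a - b) / l) ** 2 for a, b, l in zip(x, z, length_scales)))
    return sigma_f ** 2 * math.exp(-r)

def solve(A, b):
    """Solve A v = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    v = [0.0] * n
    for r in range(n - 1, -1, -1):
        v[r] = (M[r][n] - sum(M[r][k] * v[k] for k in range(r + 1, n))) / M[r][r]
    return v

def gp_posterior(X, y, x_star, length_scales, sigma_f=1.0, sigma_y=0.1):
    """Posterior mean and variance of f(x*) for a zero-mean GP."""
    n = len(X)
    # K(X, X) + sigma_y^2 I_n
    K = [[ard_exponential(X[i], X[j], length_scales, sigma_f)
          + (sigma_y ** 2 if i == j else 0.0) for j in range(n)] for i in range(n)]
    k_star = [ard_exponential(x_star, X[i], length_scales, sigma_f) for i in range(n)]
    alpha = solve(K, y)        # [K + sigma_y^2 I]^-1 y
    beta = solve(K, k_star)    # [K + sigma_y^2 I]^-1 K(X, x*)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    k_ss = ard_exponential(x_star, x_star, length_scales, sigma_f)
    var = k_ss - sum(k * b for k, b in zip(k_star, beta))
    return mean, var

# Toy one-feature data (hypothetical values, not from the paper)
X_train = [[0.0], [1.0], [2.0]]
y_train = [0.0, 1.0, 0.5]
mean_near, var_near = gp_posterior(X_train, y_train, [1.0], [1.0], sigma_y=1e-3)
mean_far, var_far = gp_posterior(X_train, y_train, [100.0], [1.0], sigma_y=1e-3)
```

Close to a training point the posterior mean reproduces the observation and the variance shrinks, whereas far from the data the prediction falls back to the prior (zero mean, variance $\sigma_f^2$). One length scale per input is what lets an ARD kernel weight the well logs differently.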
Random forest (RF). RF is an ensemble of decision trees that are trained in parallel, and the final prediction of the model is obtained by combining the outputs of the individual trees 42. The built-in feature selection of RF enables it to handle a variety of input features without eliminating specific variables to reduce dimensionality 43. The RF approach uses bootstrap aggregation (bagging) to diversify the trees in the forest. First, the number of trees B is selected, and B bootstrap samples are then drawn from the original training data, one for each tree. Since bagging samples the data randomly with replacement, around one-third of the database is not used to train each tree. The data left out of a tree's bootstrap sample are known as its "out-of-bag" (OOB) data 44.
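The "around one-third" figure follows from sampling with replacement: a given point is missed by all n draws with probability $(1 - 1/n)^n \approx e^{-1} \approx 0.368$. The short Python sketch below checks this numerically for a data set of the size quoted in the abstract; the function name and the seed are illustrative assumptions.

```python
import random

def oob_fraction(n_samples, seed=0):
    """Draw one bootstrap sample and return the share of points left out of it."""
    rng = random.Random(seed)
    in_bag = {rng.randrange(n_samples) for _ in range(n_samples)}
    return 1.0 - len(in_bag) / n_samples

n = 8061                           # number of data points quoted in the abstract
simulated = oob_fraction(n)        # OOB share observed for one bootstrap sample
expected = (1.0 - 1.0 / n) ** n    # probability that a given point is never drawn
```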
In the RF method, cross-validation is not required because the OOB data can be used to assess the model's performance through the OOB errors 45. For the training of each decision tree, the training sample used for that tree must be recorded so that its OOB data can be identified; the generalization error is then estimated from the prediction errors on the OOB data. The randomness of the RF is controlled by the number of features K considered at each split, which is typically specified as $K = \log_2 d$, where $d$ is the number of input features 45. To determine the importance of each feature $X_i$, its values are randomly permuted. The value below, written here as $VI(X_i)$, is used to quantify the relevance of a feature:

$$VI(X_i) = \frac{1}{B} \sum_{t=1}^{B} \left( OOBerr_t^{i} - OOBerr_t \right)$$

Here, $X_i$ denotes the permuted ith feature in the feature vector X, B is the number of trees in the RF, and $OOBerr_t^{i}$ is the prediction error of tree t on its OOB sample after the feature $X_i$ has been permuted. $OOBerr_t$ is the prediction error of tree t on the original OOB sample, before any permutation.
A large permutation importance indicates that the feature is relevant to the estimation, because permuting it changes the model prediction. A feature of little value has little or no effect on the model's approximation 46. Note that the minimum leaf size and parent size for the constructed RF model were set to 1 and 19, respectively.
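The permutation importance described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: each "tree" is represented by an arbitrary prediction function paired with the indices of its OOB samples, the error measure is the mean squared error, and the toy predictor and data are hypothetical rather than the forest trained in the study.

```python
import random

def mse(predict, X, y):
    """Mean squared prediction error of `predict` on (X, y)."""
    return sum((predict(x) - t) ** 2 for x, t in zip(X, y)) / len(y)

def permutation_importance(trees, X, y, feature, seed=0):
    """Average increase in OOB error after permuting one feature.

    `trees` is a list of (predict_fn, oob_indices) pairs, one per tree.
    """
    rng = random.Random(seed)
    total = 0.0
    for predict, oob in trees:
        X_oob = [X[i] for i in oob]
        y_oob = [y[i] for i in oob]
        base = mse(predict, X_oob, y_oob)                 # OOBerr_t
        column = [x[feature] for x in X_oob]
        rng.shuffle(column)                               # permute the chosen feature
        X_perm = [x[:feature] + [v] + x[feature + 1:] for x, v in zip(X_oob, column)]
        total += mse(predict, X_perm, y_oob) - base       # OOBerr_t^i - OOBerr_t
    return total / len(trees)

def toy_tree(x):
    """Hypothetical predictor that only looks at the first feature."""
    return 0.5 * x[0]

rng = random.Random(1)
X = [[rng.uniform(3.0, 6.0), rng.uniform(0.0, 100.0)] for _ in range(20)]
y = [0.5 * x[0] for x in X]
trees = [(toy_tree, list(range(0, 10))), (toy_tree, list(range(10, 20)))]
vi_first = permutation_importance(trees, X, y, feature=0)    # informative feature
vi_second = permutation_importance(trees, X, y, feature=1)   # ignored feature
```

Permuting the feature the predictor relies on raises the OOB error, while permuting the ignored feature leaves it unchanged, which is the behavior the importance measure is designed to capture.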

Extra tree (ET) .
ET is an ensemble learning method that averages the predictions of decision trees in order to improve accuracy and reduce computational complexity 47,48. The extra tree strategy generates a set of randomized trees. Their predictions are aggregated by arithmetic averaging in regression problems and by majority voting in classification problems. One significant distinction between the extra tree method and other tree-based machine learning algorithms is that nodes are split at randomly chosen cut points.
Unlike bagging, which trains each tree on a bootstrap replica, the trees are grown on the entire learning sample. In regression problems, the extra tree splitting procedure requires two key parameters: (i) the number of random splits tested at each node, denoted by K, and (ii) the minimum sample size required to split a node, denoted by n min 47,48. The extra tree algorithm grows each tree by drawing K random splits at each node and repeating this operation until the leaves are reached, that is, until all subsamples give pure responses or the number of learning samples at the node falls below n min 48. Extra trees are expected to reduce variance effectively through the random choice of cut points and input features combined with ensemble averaging. Bias, in turn, is reduced by letting every tree use the complete original learning sample 47.
In formal terms, consider training data $X = \{x_1, x_2, \ldots, x_N\}$, where each sample $x_i = \{f_1, f_2, \ldots, f_D\}$ consists of D features $f_j$, $j \in \{1, 2, \ldots, D\}$. Extra trees generate M independent decision trees (DTs). In every DT, $S_p$ denotes the subset of the training data X at child node p. The ET algorithm then selects, for each node p, the best split of $S_p$ among a random subset of features 49. Note that the minimum leaf size and parent size for the developed ET model were set to 1 and 5, respectively.
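The node-splitting rule of extra trees can be illustrated with a minimal Python sketch. It assumes a regression setting with variance reduction as the split score, one uniformly random cut point per candidate feature, and a parent size of n min = 5 as quoted above; the function names and the synthetic data are hypothetical and do not reproduce the model built in the study.

```python
import random

def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def extra_tree_split(X, y, K, n_min, rng):
    """Pick a split the extra-trees way: K random features, one random cut each."""
    if len(y) < n_min or variance(y) == 0.0:
        return None                        # too few samples or pure node: make a leaf
    n_features = len(X[0])
    best = None
    for j in rng.sample(range(n_features), min(K, n_features)):
        lo = min(x[j] for x in X)
        hi = max(x[j] for x in X)
        if lo == hi:
            continue                       # constant feature, cannot be split
        cut = rng.uniform(lo, hi)          # cut point drawn at random, not optimized
        left = [t for x, t in zip(X, y) if x[j] < cut]
        right = [t for x, t in zip(X, y) if x[j] >= cut]
        if not left or not right:
            continue
        # Score of the candidate split = variance reduction it achieves
        score = variance(y) - (len(left) * variance(left)
                               + len(right) * variance(right)) / len(y)
        if best is None or score > best[0]:
            best = (score, j, cut)
    return best                            # (score, feature index, cut point)

rng = random.Random(0)
X = [[rng.uniform(3000.0, 6000.0), rng.uniform(0.0, 100.0)] for _ in range(50)]
y = [0.5 * x[0] + 200.0 for x in X]        # synthetic target driven by feature 0
split = extra_tree_split(X, y, K=2, n_min=5, rng=rng)
```

Only the choice among the K random candidates is optimized; the cut points themselves are never searched, which is what distinguishes this rule from the exhaustive split search of a standard regression tree.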
Adaptive neuro fuzzy inference system (ANFIS). ANFIS, a widely used machine learning strategy, combines neural networks with fuzzy systems. Its primary purpose is to alleviate the limitations of neural networks and fuzzy systems while exploiting the strengths of both methodologies.
ANFIS uses the ANN learning procedure to derive rules from input and output data, resulting in a self-adaptive neural fuzzy system 50. In general, three functions are available for building fuzzy systems: genfis1, genfis2, and genfis3 51. The genfis3 function was used in the current work, so the FIS framework is a Sugeno system built on fuzzy C-means (FCM) clustering. Additionally, in fuzzy systems, membership functions may be chosen from a variety of functions 52. In the current research, a Gaussian function was applied. The ANFIS training in this work was accomplished using a hybrid technique that combines backpropagation and least-squares estimation. The input membership function parameters are computed using backpropagation, while the output membership function parameters are determined using the least-squares method.
ANFIS's architecture is composed of rules, input data, output membership functions, and membership degree functions. Fig. 1 illustrates the ANFIS design with two inputs. The first layer establishes each input's degree of membership in the different fuzzy regions. The next layer computes the weight of each rule (w i ) by multiplying the values arriving at each neuron. In the third layer, the relative weight of the rules is determined. In the fourth layer, neurons are used to determine the contribution of the rules to the output. The final layer, consisting of a single fixed neuron 53, is used to minimize the difference between the observed and forecasted output 54. As previously stated, the ANFIS paradigm is composed of five layers. The precise characteristics of each layer are listed below [55][56][57][58]. In this research, the number of nodes and fuzzy rules for the one-input ANFIS model were 16 and 3, respectively, while the number of nodes and fuzzy rules for the four-input ANFIS model were set to 57 and 5, respectively.
Layer 1: Layer 1 converts the incoming data to linguistic terms. Each input is associated with n neurons, each of which represents a preset linguistic term. The terms are produced in this layer in accordance with the previously specified membership functions. The Gaussian function used in this investigation is shown below:

$$O_i^1 = \exp\left(-\frac{(x - Z)^2}{2\sigma^2}\right)$$

In this equation, x is the input value, Z is the center of the Gaussian membership function, O denotes the layer output, and σ is the variance term. ANFIS optimizes and adjusts these parameters during the learning period 59,60.
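Layer 2: The weight (firing strength) $w_i$ of each rule is obtained as the product of the membership degrees arriving at the neuron 59,60:

$$O_i^2 = w_i = \prod_j O_{ij}^1$$

Layer 3: In this layer, the firing strength of each rule is normalized by the total firing strength of all rules using the following equation 59:

$$O_i^3 = \bar{w}_i = \frac{w_i}{\sum_j w_j}$$

Layer 4: In this layer, the contribution of each rule to the output is calculated. For the two-input design of Fig. 1, with inputs $x_1$ and $x_2$:

$$O_i^4 = \bar{w}_i f_i = \bar{w}_i \left(m_i x_1 + n_i x_2 + r_i\right)$$

In this formula, $r_i$, $n_i$, and $m_i$ denote linear parameters. ANFIS adjusts and optimizes these parameters by reducing the discrepancy between the predicted and target quantities 59,60.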
Layer 5: This layer uses weighted-average summation to convert the complete set of rule outputs into a single numerical output according to the calculation below 59,60:

$$O^5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$

Optimization algorithms. Imperialist competitive algorithm (ICA). ICA is an optimization technique inspired by imperialism, the policy of extending the power and rule of a government beyond its geographical borders 61. The method starts with an initial population of countries, and several of the best countries in this population are taken as the imperialists. Specifically, the countries with the minimum objective function or cost, for example the root mean square error (RMSE), are selected as the imperialists 62. The remaining countries are treated as colonies and distributed among these imperialists to form empires. After that, imperialistic competition starts among all the empires. The weakest empire (with the maximum RMSE), which is unable to increase its strength and to succeed in the competition, is eliminated. Throughout the competition between empires, all colonies move toward their respective imperialists. In the end, the collapse mechanism is expected to lead to a state in which only one empire remains and all the other countries are colonies of that empire. The most powerful empire (with the minimum RMSE) is taken as the solution 63.
Differential evolution (DE) optimizer. The DE optimizer is a population-based stochastic optimizer introduced by Storn and Price 64. This practical algorithm has several merits: real coding, ease of use, a local searching feature, simplicity, and high speed 65,66. The algorithm operates through the same computational steps employed by other evolutionary algorithms, and it uses the differences between parameter vectors to explore the objective space 67.
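As an illustration of how DE uses differences between parameter vectors, the Python sketch below implements the widely used DE/rand/1/bin variant and applies it to a toy RMSE minimization. The control parameters (population size, F, CR, number of generations), the bounds, and the linear toy cost function are assumptions chosen for demonstration; they are not the settings or the ANFIS cost function used in the study.

```python
import math
import random

def differential_evolution(cost, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=200, seed=0):
    """Minimize `cost` with the DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct members, all different from the target vector i
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)    # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    # Mutation: base vector plus scaled difference of two others
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)
                else:
                    v = pop[i][j]          # crossover keeps the parent's gene
                trial.append(v)
            trial_cost = cost(trial)
            if trial_cost <= costs[i]:     # greedy one-to-one selection
                pop[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best]

# Toy problem: recover slope and intercept of y = 2x + 1 by minimizing RMSE
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]

def rmse_cost(params):
    a, b = params
    return math.sqrt(sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

best_params, best_cost = differential_evolution(rmse_cost, [(-5.0, 5.0), (-5.0, 5.0)])
```

In an ANFIS + DE setting, the parameter vector would hold the membership function and rule parameters and the cost would be the model's RMSE on the training data; the same loop applies unchanged.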

Statistical evaluation.
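To evaluate and compare the constructed models, several statistical parameters were used, namely average percent relative error (APRE%), average absolute percent relative error (AAPRE%), root mean square error (RMSE), and standard deviation (SD). Their formulas are provided below, where n is the number of data points and $V_{s,i}^{\mathrm{meas}}$ and $V_{s,i}^{\mathrm{pred}}$ are the measured and estimated values of the ith data point:
1. Average percent relative error (APRE):

$$APRE = \frac{1}{n} \sum_{i=1}^{n} E_i$$

in which $E_i$ is the relative deviation of the estimated value from its measured value, defined as:

$$E_i = \frac{V_{s,i}^{\mathrm{pred}} - V_{s,i}^{\mathrm{meas}}}{V_{s,i}^{\mathrm{meas}}} \times 100$$

2. Average absolute percent relative error (AAPRE):

$$AAPRE = \frac{1}{n} \sum_{i=1}^{n} |E_i|$$

3. Standard deviation (SD):

$$SD = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{V_{s,i}^{\mathrm{pred}} - V_{s,i}^{\mathrm{meas}}}{V_{s,i}^{\mathrm{meas}}}\right)^2}$$

4. Root mean square error (RMSE):

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(V_{s,i}^{\mathrm{pred}} - V_{s,i}^{\mathrm{meas}}\right)^2}$$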
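In addition, the relevancy factor (r) was calculated to analyze the relationship between the inputs and the output. The following formula was applied to calculate the relevancy factor of the kth input:

$$r(\mathrm{Input}_k, \mathrm{output}) = \frac{\sum_{i=1}^{n} \left(\mathrm{Input}_{k,i} - \mathrm{Input}_{ave,k}\right)\left(\mathrm{output}_i - \mathrm{output}_{ave}\right)}{\sqrt{\sum_{i=1}^{n} \left(\mathrm{Input}_{k,i} - \mathrm{Input}_{ave,k}\right)^2 \sum_{i=1}^{n} \left(\mathrm{output}_i - \mathrm{output}_{ave}\right)^2}}$$

Here, $\mathrm{output}_i$ is the value of the ith estimated output and $\mathrm{output}_{ave}$ is the mean value of the estimated output. $\mathrm{Input}_{k,i}$ is the ith value of the kth input variable, while $\mathrm{Input}_{ave,k}$ is the mean value of the kth input variable 68.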
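The statistical parameters and the relevancy factor can be computed with a few lines of Python. The sketch below is a minimal illustration that assumes the relative deviation is taken as (estimated − measured) / measured, that SD is computed on fractional relative errors with an n − 1 denominator, and that the relevancy factor is the Pearson correlation coefficient; the function names and the three sample values are hypothetical.

```python
import math

def relative_errors(measured, predicted):
    """Fractional deviation of each estimate from its measured value."""
    return [(p - m) / m for m, p in zip(measured, predicted)]

def apre(measured, predicted):
    e = relative_errors(measured, predicted)
    return 100.0 * sum(e) / len(e)

def aapre(measured, predicted):
    e = relative_errors(measured, predicted)
    return 100.0 * sum(abs(v) for v in e) / len(e)

def rmse(measured, predicted):
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / len(measured))

def sd(measured, predicted):
    e = relative_errors(measured, predicted)
    return math.sqrt(sum(v * v for v in e) / (len(e) - 1))

def relevancy_factor(inputs, outputs):
    """Pearson correlation between one input log and the model output."""
    mi = sum(inputs) / len(inputs)
    mo = sum(outputs) / len(outputs)
    num = sum((a - mi) * (b - mo) for a, b in zip(inputs, outputs))
    den = math.sqrt(sum((a - mi) ** 2 for a in inputs) * sum((b - mo) ** 2 for b in outputs))
    return num / den

# Hypothetical measured and estimated shear velocities (m/s)
vs_meas = [2000.0, 2500.0, 3000.0]
vs_pred = [2020.0, 2475.0, 3000.0]
metrics = {
    "APRE": apre(vs_meas, vs_pred),
    "AAPRE": aapre(vs_meas, vs_pred),
    "RMSE": rmse(vs_meas, vs_pred),
    "SD": sd(vs_meas, vs_pred),
}
```

In this example the positive and negative deviations cancel, so APRE is close to zero while AAPRE is not. This is why AAPRE, rather than APRE, is the measure used to rank the models.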

Results and discussion
Assessment of the validity and accuracy of one input-developed models. Table 1 summarizes the values of the statistical parameters defined above for the train, test, and total datasets when one variable (Vp) is used as the input. As given in this table and in Figs. 2 and 3, the smallest overall AAPRE (1.34%), RMSE (57.99), and standard deviation (0.019) belong to the extra tree (ET) model. After the ET model, the Gaussian process regression (GPR) model shows low values of overall AAPRE (1.54%) and RMSE (66.25). The GPR and random forest (RF) models have closely similar AAPRE and RMSE values (Figs. 2 and 3), so a relatively similar performance can be concluded for these two models. Collectively, the ET model can be regarded as the optimum model, estimating the target with higher accuracy than the other models in the current study. Based on the overall AAPRE values, the performance of the models can be summarized as below:

ET (1.34%) > GPR (1.54%) ≈ RF (1.54%) > ANFIS + DE (1.56%) > ANFIS + ICA (1.57%)

Fig. 6 depicts the cumulative frequency of the absolute relative error for the models used in this study. As indicated by this chart, the ET model approximates more than 30% of the Vs points with an absolute relative error below 0.5%. Additionally, roughly 90% of the Vs values estimated by the ET model show an absolute relative error lower than 3%. Correspondingly, the superior performance of the ET model in predicting Vs compared with the other approaches can be deduced.

Outlier detection and utility domain of the constructed ET model (one input model).
Outlier detection is a time-efficient method for finding data that are distinct from the rest of the data in a databank 69. The Leverage technique is a well-known methodology for detecting outliers, as it is based on data residuals (the departure of a model's predictions from experimental findings) [69][70][71][72]. In the leverage approach, a hat matrix (H) is defined to establish the hat indexes, or leverages, of the data as follows 69,73:

$$H = X \left(X^T X\right)^{-1} X^T$$

in which X denotes a two-dimensional matrix containing N rows (data sets) and K columns (model features), and T represents the transpose operator. The diagonal components of H are the hat values of the data 69,73.
In a Williams plot, standardized residuals are plotted against hat values, and the regions of out of leverage data, suspected data, and valid data are identified. The standardized residual (SR) of each data point is defined as below 73:

$$SR_i = \frac{e_i}{RMSE \sqrt{1 - H_{ii}}}$$

in which $e_i$ represents the deviation of the estimated data from its experimental value (estimated output − measured data), RMSE stands for the root mean square error of the model, and $H_{ii}$ denotes the hat index of the ith data set.
In the leverage approach, a warning leverage ($H^*$) is determined to accept or reject the model results and calculations. This criterion is defined as $H^* = 3(k+1)/N$, where k is the number of model features and N is the number of data sets. Commonly, a cut-off of 3 on the standardized residuals (±3 standard deviations from the mean) is selected to cover 99% of the dispersed data. When most of the data sets fall within the intervals $0 \le H_{ii} \le H^*$ and $-3 \le SR_i \le 3$, it may be inferred that the proposed model and its approximations are valid and that the experimental data used for model development are reliable 69,73.
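The leverage calculation can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the hat values are computed as $H_{ii} = x_i^T (X^T X)^{-1} x_i$ with a small hand-written linear solver, points with |SR| > 3 are labeled as suspected, the remaining points with $H_{ii} > H^*$ as out of leverage, and all names and toy numbers are hypothetical.

```python
import math

def solve(A, b):
    """Solve A v = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    v = [0.0] * n
    for r in range(n - 1, -1, -1):
        v[r] = (M[r][n] - sum(M[r][k] * v[k] for k in range(r + 1, n))) / M[r][r]
    return v

def hat_values(X):
    """Diagonal of H = X (X^T X)^-1 X^T, computed row by row."""
    k = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    return [sum(xi * vi for xi, vi in zip(row, solve(XtX, row))) for row in X]

def williams_classification(X, measured, predicted):
    """Return hat values, warning leverage H*, and a label for every data point."""
    n, k = len(X), len(X[0])
    h = hat_values(X)
    h_star = 3.0 * (k + 1) / n
    residuals = [p - m for m, p in zip(measured, predicted)]   # estimated - measured
    rmse = math.sqrt(sum(e * e for e in residuals) / n)
    labels = []
    for e, hi in zip(residuals, h):
        sr = e / (rmse * math.sqrt(1.0 - hi))                  # standardized residual
        if abs(sr) > 3.0:
            labels.append("suspected")
        elif hi > h_star:
            labels.append("out of leverage")
        else:
            labels.append("valid")
    return h, h_star, labels

# Toy data: 19 similar inputs, one distant input, and one large residual
X = [[1.0] for _ in range(19)] + [[4.0]]
measured = [100.0] * 20
predicted = [110.0] + [101.0 if i % 2 else 99.0 for i in range(1, 20)]
h, h_star, labels = williams_classification(X, measured, predicted)
```

A useful sanity check is that the hat values sum to the number of model features k, because H is a projection matrix onto the column space of X.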
The data points in the ranges −3 ≤ SR ≤ 3 and H* ≤ H are known as good high leverage points. These points are outside the applicability domain of the model. The data sets situated in the interval SR < −3 or SR > 3 (regardless of their H value) are known as bad high leverage points. These data points are regarded as experimentally suspected data that may result from errors in the experimental measurements 69,73. Figure 7 depicts the Williams plot and notably implies that all 8061 data points are valid data.

Assessment of the validity and accuracy of four inputs-constructed models.
Table 2 summarizes the values of the statistical parameters for the train, test, and total datasets. Similar to the models constructed with one input, a closely similar performance can be concluded for the Gaussian process regression (GPR) and random forest (RF) models because of the subtle differences in their AAPRE and RMSE values. Likewise, the performances of the two optimization algorithms do not differ considerably from each other. Therefore, the extra tree (ET) model can be recognized as the ideal model, approximating the target (Vs) with higher accuracy than the other models created in this paper. Based on the overall AAPRE values, the performance of the constructed models can be summarized as below:

ET (1.03%) > GPR (1.16%) > RF (1.25%) > ANFIS + ICA (1.49%) > ANFIS + DE (1.63%)

The models developed with one input follow the same trend except for the optimizers, but generally lower error values are obtained when the models are developed with four inputs. Figure 11 shows the cross plots of the applied models. From these cross plots, it is apparent that the predictions of the applied models generally show a highly satisfactory agreement with the straight line. However, it can be observed that the data set belonging to the extra tree (ET) model (Fig. 11c)
Illustrating the error distribution curve of the optimum model is another tool used to assess the four-input models graphically. Figure 12 shows this curve for the developed extra tree (ET) model as the ideal model. As is visible, the major part of the data points lies near the zero line of the relative error (RE). This suggests the high accuracy of the developed extra tree (ET) model.
The cumulative frequency of the absolute relative error for the four-input models and the created correlation is depicted in Fig. 13. As this figure clarifies, the ET model could estimate approximately 93% of the Vs points with an absolute relative error of less than 3%. Correspondingly, the superior effectiveness of the ET model in forecasting Vs compared with the other strategies can be concluded.

Sensitivity analysis of the ET model (four inputs model).
Sensitivity analysis investigates the effect of variations in a model's inputs on the model's output value. The relevancy factor is a suitable measure for this purpose because it quantifies the influence of each input parameter on the output. A higher value of the relevancy factor (r) for an input indicates a more prominent effect of that input on the output 73. Figure 14 shows the effect of the four inputs on Vs as the target parameter in this research. It implies that Vp has a considerably greater influence on the Vs value than the other three inputs. This outcome is consistent with the generally similar errors of the one-input and four-input models: Vp, the only input in the first scenario of this paper, has the highest impact on the target parameter Vs.

Outlier detection and utility domain of the constructed ET model (four inputs model).
The result of the Leverage approach for the extra tree (ET) model constructed with four inputs is shown in Fig. 15. It is plainly visible that most data sets are situated in the valid zone, and there are only 240 "out of leverage" data sets out of 8060. Additionally, only 169 out of 8060 data points are suspected data. These numbers indicate that the experimental data are reliable and that the developed ET model is statistically valid.

Group error analysis (four inputs models).
Figure 16 shows the group error distribution for the four inputs, each divided into five intervals. For the Vp input, as demonstrated in Fig. 16a, the smallest AAPRE within the interval of 4144 to 4691 belongs to the GPR model. The extra tree (ET) model shows a lower AAPRE than that of
In the case of the second input (GR) (Fig. 16b), it is evident that the extra tree (ET) model has the minimum AAPRE for all five defined intervals. Regarding RHOB (Fig. 16c), the extra tree (ET) model generally shows a lower AAPRE than the other models. However, for the last defined range of RHOB values (> 2.72), the Gaussian process regression (GPR) model shows a notably lower AAPRE than the other models. Finally, considering the NPHI input (Fig. 16d), the extra tree (ET) model collectively shows lower AAPRE values than the other developed intelligent models.
Implications to candidate selection for hydraulic fracturing. Hydraulic fracturing widely serves as an essential technique to enhance the productivity of low-permeability hydrocarbon reservoirs. Massive hydraulic fracturing involves the injection of large volumes of water at high pressures and rates, making economic production from gas shales of nano-darcy-range permeability viable 74,75. However, not all depth intervals in the reservoir are appropriate for fracturing. Indeed, a sound selection of candidate layers for a fracturing completion is the key to ensuring high profitability. The degree to which the rock can be efficiently fractured to create a wide and sufficiently permeable fracture network for the hydrocarbon to flow is characterized by the brittleness index, BI 76,77. Consequently, the literature has witnessed tremendous efforts in recent years to develop accurate and credible brittleness models (see, for instance, Kivi et al. 78
where the superscripts "min" and "max" stand for the lowest and highest values of the elastic moduli, respectively. The so-called elastic brittleness index has drawn widespread attention in field applications owing mainly to its simplicity and proficiency, proven through comparison with rock failure behavior in the laboratory 17 and with field observations 80,81. The elastic moduli can be conveniently evaluated from wireline logging data as:

$$E = \frac{\rho V_s^2 \left(3V_p^2 - 4V_s^2\right)}{V_p^2 - V_s^2} \qquad (24)$$

$$\nu = \frac{V_p^2 - 2V_s^2}{2\left(V_p^2 - V_s^2\right)} \qquad (25)$$

where $E$ and $\nu$ are Young's modulus and Poisson's ratio, and $\rho\,[\mathrm{kg/m^3}]$, $V_p\,[\mathrm{m/s}]$, and $V_s\,[\mathrm{m/s}]$ denote the rock's bulk density and compressional and shear sonic velocities, respectively. Equations (23) to (25) point to the importance of developing shear velocity proxies for optimizing the hydraulic fracturing operation where full-waveform sonic data are partially or entirely missing.
where the velocities are in m/s.
The developed correlation shows a high accuracy, characterized by an AAPRE and RMSE of 2.2% and 89.03, respectively, which are comparable to the values achieved by the artificial intelligence models (see Tables 1 and 2). These statistics seem to attest to the high precision of the constructed linear model. The reliability of the created models can also be inferred from the estimated profiles of Young's modulus along the examined formation (Fig. 17). The Young's modulus profiles calculated from the modeled shear velocities (ET models and linear regression) and from the DSI data match almost perfectly. However, discrepancies arise when comparing the vertical distributions of Poisson's ratio obtained from the three models (Fig. 17). The shear velocity estimates of the four-variable ET model result in a Poisson's ratio profile that is in good agreement with the actual one, i.e., the profile calculated from DSI data. Although the single-input ET model satisfactorily captures the general trends of Poisson's ratio across the layer, a perfect quantitative match is missing. This comparison clearly highlights the complex dependence of sonic velocity on a set of contributing factors, a complexity that only a combination of well logging data can realistically reflect. This inconsistency may not necessarily pose major uncertainties to our analysis because what matters in candidate selection for hydraulic fracturing is the relative sequence of brittleness and not its absolute value. Interestingly, despite its acceptable accuracy in estimating velocity, the linear model completely fails to estimate the Poisson's ratio profile, either qualitatively or quantitatively. This disagreement is evidently due to the fact that Poisson's ratio depends only on the ratio of compressional to shear velocity (see Eq. 25). Accordingly, the smaller the absolute value of the velocity intercept, the smaller the variability of Poisson's ratio, and the closer its value is to a constant controlled by the velocity ratio. The linear relation derived here narrows the variation of Poisson's ratio down to as little as 0.3 to 0.32 (Fig. 18).
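The dependence of Poisson's ratio on the velocity ratio alone can be verified with a short Python sketch. It uses the standard dynamic elastic relations for Young's modulus and Poisson's ratio; the brittleness function assumes an equally weighted average of normalized Young's modulus and inverted normalized Poisson's ratio, which is one common elastic brittleness definition and may differ from the expression used in the study. The slope 0.53 of the zero-intercept linear model and all sample values are hypothetical.

```python
def youngs_modulus(rho, vp, vs):
    """Dynamic Young's modulus in Pa from density (kg/m^3) and velocities (m/s)."""
    return rho * vs ** 2 * (3.0 * vp ** 2 - 4.0 * vs ** 2) / (vp ** 2 - vs ** 2)

def poissons_ratio(vp, vs):
    """Dynamic Poisson's ratio; depends only on the ratio vp / vs."""
    return (vp ** 2 - 2.0 * vs ** 2) / (2.0 * (vp ** 2 - vs ** 2))

def brittleness_index(E, nu, E_min, E_max, nu_min, nu_max):
    """Assumed elastic brittleness: mean of normalized E and inverted normalized nu."""
    e_n = (E - E_min) / (E_max - E_min)
    nu_n = (nu_max - nu) / (nu_max - nu_min)
    return 0.5 * (e_n + nu_n)

# Hypothetical limestone sample
rho, vp, vs = 2650.0, 5000.0, 2700.0
E = youngs_modulus(rho, vp, vs)
nu = poissons_ratio(vp, vs)

# A zero-intercept linear model Vs = 0.53 * Vp (hypothetical slope) fixes the
# velocity ratio, so Poisson's ratio becomes the same constant at every depth.
nu_fast = poissons_ratio(5000.0, 0.53 * 5000.0)
nu_slow = poissons_ratio(4000.0, 0.53 * 4000.0)
```

With a nonzero intercept the ratio Vp/Vs still varies slightly with Vp, but only within a narrow band. This is consistent with the narrow Poisson's ratio range obtained from the linear relation in the text.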
We assessed the brittleness profiles along the formation using equation (33) and the predicted elastic moduli. We also employed the well-established k-means clustering technique 82 to develop a mechanical rock classification and diagnose rock classes of different brittleness ranges. After a trial-and-error procedure for clustering, we assumed four rock clusters for illustration purposes. One should bear in mind that the number of rock clusters should be determined based on the rock types identified through a detailed geological analysis of recovered cores and thin sections 83. Furthermore, for a robust screening of sweet spots, the clustering should also take into account other influential parameters such as rock porosity, permeability, saturation, and in-situ stresses, which is beyond the scope of this study. As expected from the elastic moduli predictions, a comparison of the brittleness profiles and clusters obtained from the ET model evaluations and from the recorded velocities shows a good agreement (Fig. 19). The lowermost 100 m of the formation and some scattered intervals in its middle and top (light and dark green clusters) are found to have relatively higher brittleness than the adjacent zones (purple and red clusters). Therefore, the former groups can be considered target layers for hydraulic fracturing, while the latter potentially act as fracture barriers. However, the regression-based brittleness estimate, inheriting errors from the elastic parameter calculations, is not able to follow the overall trends, and the associated fracturing design would be misleading. Briefly, it can be concluded that using linear models to estimate the shear sonic velocity gives rise to uncertainties in evaluating the rock's Poisson's ratio and negatively impacts subsequent geo-mechanical studies. Hence, their application to fill in data gaps should be restricted or treated cautiously.
Instead, the employed intelligent approaches provide powerful tools for velocity estimations and should be taken as common practice in the industry.
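The clustering step can be illustrated with a minimal one-dimensional k-means in Python. The sketch assumes that only the brittleness index is clustered, that four clusters are used as in the text, and that the initial centers are placed at evenly spaced quantiles to keep the result deterministic; the function name and the brittleness values are hypothetical.

```python
def kmeans_1d(values, k, iterations=100):
    """Plain k-means on scalar values with quantile-based initial centers."""
    ordered = sorted(values)
    n = len(ordered)
    centers = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members
        new_centers = []
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break                          # converged
        centers = new_centers
    labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
    return centers, labels

# Hypothetical brittleness index values along depth
bi_profile = [0.10, 0.62, 0.12, 0.90, 0.30, 0.92, 0.32, 0.60]
centers, labels = kmeans_1d(bi_profile, k=4)
```

Each depth sample receives the label of its nearest cluster center, so contiguous runs of the same label can be read as mechanical rock classes. In practice the number of clusters would follow from the geological analysis mentioned above rather than being fixed in advance.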

Conclusions
In this paper, two scenarios were adopted to estimate Vs from the petrophysical well logs of GR, RHOB, NPHI, and Vp. For this objective, five different intelligent models were employed: random forest (RF), extra tree (ET), Gaussian process regression (GPR), and ANFIS optimized with differential evolution (DE) and the imperialist competitive algorithm (ICA). In the first scenario, the target was predicted based on Vp only, and the extra tree (ET) model provided a lower AAPRE than the other intelligent models. Furthermore, the cross plot of the approximated Vs against its measured values showed more uniformity for the extra tree (ET) model than for the other implemented models. The error distribution curve also showed the high accuracy of the extra tree (ET) model. The cumulative frequency of the absolute relative error further supported the better performance of the extra tree (ET) model compared with the other developed models. Notably, the Williams plot of the data sets illustrated that all 8061 data points are valid. Ultimately, the group error analysis showed that the developed extra tree (ET) model has a lower AAPRE than the other models within all divided data sets.
The second scenario predicted Vs from the combination of the Vp, GR, RHOB, and NPHI inputs. As in the first approach, the minimum AAPRE was obtained by the extra tree (ET) model. Likewise, the cross plot of the experimental Vs values versus the values approximated by the ET model indicated more uniformity than for the other models. The error distribution curve and the cumulative frequency of the absolute relative error also demonstrated a more acceptable performance for the ET model. The leverage approach suggested that both the measured data and the developed ET model are statistically valid. In addition, the sensitivity analysis showed that Vp has a higher impact on the target parameter (Vs) than the other inputs. Generally, it can be concluded that the second approach is more acceptable because of the lower errors achieved by its constructed models.
The field applicability of the ET models, as the most accurate intelligent approach developed, was verified and compared with that of the linear regression model. The ET models, particularly that of the second scenario, satisfactorily estimated elastic moduli profiles in close quantitative agreement with field measurements and diagnosed brittle layers for hydraulic fracturing along the Sarvak formation. Interestingly, although of acceptable accuracy, the regression-based velocity profile led to pronounced uncertainties in evaluating the rock's Poisson's ratio and in subsequent geo-mechanical evaluations, for instance, as discussed in this study, the relative sequence of brittle layers for hydraulic fracturing. This highlights the better performance of the established intelligent frameworks for sonic velocity estimation and strongly suggests their wide employment in reservoir evaluation practices. Nevertheless, the choice of appropriate input well-logging variables, particularly when any of the conventional